
16 research outputs found

Detecting the Sensing Area of a Laparoscopic Probe in Minimally Invasive Cancer Surgery

Author: Elson Daniel S.
Giannarou Stamatia
Hu Yicheng
Huang Baoru
Nguyen Anh
Publication date: 07/07/2023

In surgical oncology, it is challenging for surgeons to identify lymph nodes and completely resect cancer even with pre-operative imaging systems like PET and CT, because of the lack of reliable intraoperative visualization tools. Endoscopic radio-guided cancer detection and resection has recently been evaluated whereby a novel tethered laparoscopic gamma detector is used to localize a preoperatively injected radiotracer. This can both enhance the endoscopic imaging and complement preoperative nuclear imaging data. However, gamma activity visualization is challenging to present to the operator because the probe is non-imaging and it does not visibly indicate the activity origination on the tissue surface. Initial failed attempts used segmentation or geometric methods, but led to the discovery that it could be resolved by leveraging high-dimensional image features and probe position information. To demonstrate the effectiveness of this solution, we designed and implemented a simple regression network that successfully addressed the problem. To further validate the proposed solution, we acquired and publicly released two datasets captured using a custom-designed, portable stereo laparoscope system. Through intensive experimentation, we demonstrated that our method can successfully and effectively detect the sensing area, establishing a new performance benchmark. Code and data are available at https://github.com/br0202/Sensing_area_detection.git
Comment: Accepted by MICCAI 202

arXiv.org e-Print Archive

Language-driven Scene Synthesis using Multi-conditional Diffusion Model

Author: Huang Baoru
Nguyen Anh
Nguyen Dzung
Nguyen Toan Tien
Vo Thieu
Vu Minh Nhat
Vuong An
Publication date: 24/10/2023

Scene synthesis is a challenging problem with several industrial applications. Recently, substantial efforts have been directed to synthesize the scene using human motions, room layouts, or spatial graphs as the input. However, few studies have addressed this problem from multiple modalities, especially combining text prompts. In this paper, we propose a language-driven scene synthesis task, which is a new task that integrates text prompts, human motion, and existing objects for scene synthesis. Unlike other single-condition synthesis tasks, our problem involves multiple conditions and requires a strategy for processing and encoding them into a unified space. To address the challenge, we present a multi-conditional diffusion model, which differs from the implicit unification approach of other diffusion literature by explicitly predicting the guiding points for the original data distribution. We demonstrate that our approach is theoretically supported. The intensive experiment results illustrate that our method outperforms state-of-the-art benchmarks and enables natural scene editing applications. The source code and dataset can be accessed at https://lang-scene-synth.github.io/.
Comment: Accepted to NeurIPS 202

arXiv.org e-Print Archive

Open-Vocabulary Affordance Detection using Knowledge Distillation and Text-Point Correlation

Author: Huang Baoru
Le Ngan
Nguyen Anh
Nguyen Toan
Van Vo Tuan
Vo Thieu
Vu Minh Nhat
Publication date: 19/09/2023

Affordance detection presents intricate challenges and has a wide range of robotic applications. Previous works have faced limitations such as the complexities of 3D object shapes, the wide range of potential affordances on real-world objects, and the lack of open-vocabulary support for affordance understanding. In this paper, we introduce a new open-vocabulary affordance detection method in 3D point clouds, leveraging knowledge distillation and text-point correlation. Our approach employs pre-trained 3D models through knowledge distillation to enhance feature extraction and semantic understanding in 3D point clouds. We further introduce a new text-point correlation method to learn the semantic links between point cloud features and open-vocabulary labels. The intensive experiments show that our approach outperforms previous works and adapts to new affordance labels and unseen objects. Notably, our method improves the mIOU score by 7.96% compared to the baselines. Furthermore, it offers real-time inference, which is well suited to robotic manipulation applications.
Comment: 8 page

arXiv.org e-Print Archive

Detecting the Sensing Area of a Laparoscopic Probe in Minimally Invasive Cancer Surgery

Author: Elson Daniel S
Giannarou Stamatia
Hu Yicheng
Huang Baoru
Nguyen Anh
Publication venue: Springer Nature Switzerland
Publication date: 01/10/2023

University of Liverpool Repository

Spiral - Imperial College Digital Repository

Language-Conditioned Affordance-Pose Detection in 3D Point Clouds

Author: Huang Baoru
Le Bac
Le Ngan
Nguyen Anh
Nguyen Toan
Truong Vy
Van Vo Tuan
Vo Thieu
Vu Minh Nhat
Publication date: 19/09/2023

Affordance detection and pose estimation are of great importance in many robotic applications. Their combination helps the robot gain an enhanced manipulation capability, in which the generated pose can facilitate the corresponding affordance task. Previous methods for affordance-pose joint learning are limited to a predefined set of affordances, thus limiting the adaptability of robots in real-world environments. In this paper, we propose a new method for language-conditioned affordance-pose joint learning in 3D point clouds. Given a 3D point cloud object, our method detects the affordance region and generates appropriate 6-DoF poses for any unconstrained affordance label. Our method consists of an open-vocabulary affordance detection branch and a language-guided diffusion model that generates 6-DoF poses based on the affordance text. We also introduce a new high-quality dataset for the task of language-driven affordance-pose joint learning. Intensive experimental results demonstrate that our proposed method works effectively on a wide range of open-vocabulary affordances and outperforms other baselines by a large margin. In addition, we illustrate the usefulness of our method in real-world robotic applications. Our code and dataset are publicly available at https://3DAPNet.github.io
Comment: Project page: https://3DAPNet.github.i

arXiv.org e-Print Archive

CathSim: An Open-source Simulator for Autonomous Cannulation

Author: Abdelaziz Mohamed E. M. K.
Baena Ferdinando Rodriguez y
Berthet-Rayne Pierre
Fichera Sebastiano
Huang Baoru
Jianu Tudor
Lee Chun-Yi
Nguyen Anh
Vu Minh Nhat
Publication date: 02/08/2022

Autonomous robots in endovascular operations have the potential to navigate circulatory systems safely and reliably while decreasing the susceptibility to human errors. However, there are numerous challenges involved in training such robots, such as long training duration due to the sample inefficiency of machine learning algorithms and safety issues arising from the interaction between the catheter and the endovascular phantom. Physics simulators have been used in the context of endovascular procedures, but they are typically employed for staff training and generally do not conform to the autonomous cannulation goal. Furthermore, most current simulators are closed-source, which hinders the collaborative development of safe and reliable autonomous systems. In this work, we introduce CathSim, an open-source simulation environment that accelerates the development of machine learning algorithms for autonomous endovascular navigation. We first simulate the high-fidelity catheter and aorta with the state-of-the-art endovascular robot. We then provide the capability of real-time force sensing between the catheter and the aorta in the simulation environment. We validate our simulator by conducting two different catheterisation tasks within two primary arteries using two popular reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC). The experimental results show that using our open-source simulator, we can successfully train the reinforcement learning agents to perform different autonomous cannulation tasks.

arXiv.org e-Print Archive

Language-driven Scene Synthesis using Multi-conditional Diffusion Model

Author: Huang Baoru
Nguyen Anh
Nguyen Dzung
Nguyen Toan
Vo Thieu
Vu Minh
Vuong An

University of Liverpool Repository

Simultaneous Depth Estimation and Surgical Tool Segmentation in Laparoscopic Images

Author: Elson Daniel S
Giannarou Stamatia
Huang Baoru
Mayer Erik
Nguyen Anh
Tuch David
Vyas Kunal
Wang Siyao
Wang Ziyang
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/04/2022

Surgical instrument segmentation and depth estimation are crucial steps to improve autonomy in robotic surgery. Most recent works treat these problems separately, making the deployment challenging. In this paper, we propose a unified framework for depth estimation and surgical tool segmentation in laparoscopic images. The network has an encoder-decoder architecture and comprises two branches for simultaneously performing depth estimation and segmentation. To train the network end to end, we propose a new multi-task loss function that effectively learns to estimate depth in an unsupervised manner, while requiring only semi-ground truth for surgical tool segmentation. We conducted extensive experiments on different datasets to validate these findings. The results showed that the end-to-end network successfully improved on the state of the art for both tasks while reducing the complexity of their deployment.

University of Liverpool Repository

PubMed Central

Spiral - Imperial College Digital Repository

A Novel Medical Image Watermarking in Three-dimensional Fourier Compressed Domain

Author: Baoru Han
Jingbing Li
Mengxing Huang
Publication venue: Bulgarian Academy of Sciences

Digital watermarking, which protects the copyright of digital images, is a research hotspot in the field of image security. To ensure the security of medical image information, a novel medical image digital watermarking algorithm in the three-dimensional Fourier compressed domain is proposed. The algorithm is a robust zero-watermarking algorithm that takes advantage of the characteristics of the three-dimensional Fourier compressed domain, the encryption features of a Legendre chaotic neural network, and the robustness of difference hashing. On the one hand, the original watermarking image is encrypted with a Legendre chaotic neural network to enhance security. On the other hand, the zero-watermark is constructed using difference hashing in the three-dimensional Fourier compressed domain. The algorithm does not need to select a region of interest, and it avoids the problem of the medical image content being affected. The specific implementation of the algorithm and the experimental results are given in the paper. The simulation results show that the algorithm is desirably robust to common attacks and geometric attacks.

Directory of Open Access Journals